@dotCipher dotCipher commented Nov 1, 2025

Real-Time Recording Visualization

Adds live visual feedback during voice recording to improve UX and help users understand what's happening during recording sessions.

What This Does

Shows real-time recording feedback with:

  • Audio level meter with color-coded RMS visualization (0-100%)
  • Recording duration vs. maximum time
  • Speech detection status (detected/not detected)
  • State indicators: WAITING (yellow) → ACTIVE (green) → SILENCE (blue)
  • Silence progress bar showing accumulation toward threshold
  • Progress toward the minimum duration required before silence can stop recording

Example

╭─────────────────────────────── 🎤 Recording... ───────────────────────────────╮
│                                                                              │
│     Duration:  3.2s / 120.0s                                                 │
│        State:  ACTIVE                                                        │
│       Speech:  ✓ Detected                                                    │
│  Audio Level:  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░░░░░░░░░░░░  72%                │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯
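For reviewers who want a feel for how the Audio Level row can be drawn, here is a minimal sketch. It is not the code from this PR: `render_level_bar`, `level_color`, the 40-cell width, and the color thresholds are all assumptions made for illustration.

```python
def render_level_bar(level: float, width: int = 40) -> str:
    """Return a text meter for a level in the range 0.0 to 1.0.

    Hypothetical helper: the name, width, and glyphs are illustrative only.
    """
    level = max(0.0, min(1.0, level))      # clamp out-of-range input
    filled = int(level * width)            # number of "lit" cells (truncated)
    bar = "▓" * filled + "░" * (width - filled)
    return f"{bar}  {round(level * 100)}%"


def level_color(level: float) -> str:
    """Map a level to a color name. Thresholds are assumed, not from the PR."""
    if level < 0.10:
        return "dim"      # too quiet to be useful
    if level < 0.85:
        return "green"    # healthy input level
    return "red"          # close to clipping


if __name__ == "__main__":
    print(render_level_bar(0.72), level_color(0.72))
```

Truncating with `int()` rather than rounding means the bar only shows a full cell once the level has actually reached it.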

Why This Matters

Users previously had no visual feedback during recording, leading to:

  • Uncertainty about whether the mic was working
  • Confusion about when to speak (only feedback was an audio beep)
  • Difficulty diagnosing audio issues
  • No visibility into the VAD state machine

This PR solves these issues by making the recording process transparent and providing clear visual cues.

Implementation

New module (voice_mode/recording_visualization.py):

  • RecordingVisualizer class with thread-safe updates
  • Live display at 10 FPS using the Rich library
  • Zero overhead when disabled
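The thread-safety point is worth illustrating, since the audio callback and the display loop run on different threads. The sketch below shows one common pattern (a lock around an immutable snapshot). It is an assumption about the general approach, not the actual `RecordingVisualizer` implementation; `RecordingSnapshot`, `VisualizerState`, and their fields are hypothetical names, and the Rich rendering side is deliberately left out.

```python
import threading
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RecordingSnapshot:
    """Immutable view of the recording state (hypothetical field names)."""
    duration: float = 0.0
    max_duration: float = 120.0
    level_percent: float = 0.0
    speech_detected: bool = False
    state: str = "WAITING"


class VisualizerState:
    """Shared state written by the audio thread and read by the display thread."""

    def __init__(self, max_duration: float = 120.0) -> None:
        self._lock = threading.Lock()
        self._snapshot = RecordingSnapshot(max_duration=max_duration)

    def update(self, **changes) -> None:
        # Called from the audio callback: swap in a new snapshot under the lock.
        with self._lock:
            self._snapshot = replace(self._snapshot, **changes)

    def snapshot(self) -> RecordingSnapshot:
        # Called from the display loop: readers always see a consistent snapshot.
        with self._lock:
            return self._snapshot


if __name__ == "__main__":
    shared = VisualizerState()
    shared.update(duration=3.2, level_percent=72.0, speech_detected=True, state="ACTIVE")
    print(shared.snapshot())
```

Because each snapshot is frozen, the display loop can never observe a half-applied update, and the lock is held only for the duration of a reference swap, which keeps the audio callback fast.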

Integration (voice_mode/tools/converse.py):

  • Real-time RMS calculation in audio callback
  • State tracking (WAITING → ACTIVE → SILENCE)
  • Proper cleanup on completion and errors
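To make the RMS and state-tracking bullets concrete, here is a small stdlib-only sketch. It assumes 16-bit signed PCM samples and a simple transition rule (speech moves to ACTIVE, loss of speech after ACTIVE moves to SILENCE); the real callback in `converse.py` may work on different sample types and use different rules. `rms_percent`, `next_state`, and `RecordingState` are illustrative names.

```python
import math
from enum import Enum


class RecordingState(Enum):
    WAITING = "WAITING"   # no speech heard yet
    ACTIVE = "ACTIVE"     # speech currently detected
    SILENCE = "SILENCE"   # speech was heard, now quiet


def rms_percent(samples, full_scale: float = 32768.0) -> float:
    """RMS of int16-style samples as a percentage of full scale (0 to 100)."""
    if not samples:
        return 0.0
    mean_square = sum(s * s for s in samples) / len(samples)
    return min(100.0, 100.0 * math.sqrt(mean_square) / full_scale)


def next_state(state: RecordingState, speech_detected: bool) -> RecordingState:
    """Assumed transition rule for WAITING -> ACTIVE -> SILENCE."""
    if speech_detected:
        return RecordingState.ACTIVE
    if state is RecordingState.ACTIVE:
        return RecordingState.SILENCE
    return state  # WAITING stays WAITING, SILENCE stays SILENCE


if __name__ == "__main__":
    state = RecordingState.WAITING
    for chunk, speech in [([0, 0], False), ([16384, -16384], True), ([10, -10], False)]:
        state = next_state(state, speech)
        print(state.value, rms_percent(chunk))
```

In this sketch, speech resuming during SILENCE returns to ACTIVE, which is one reasonable reading of the state indicator but is not stated in the description above.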

Configuration:

  • Enabled by default for better UX
  • Can be disabled via VOICEMODE_RECORDING_VISUALIZATION=false
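For reference, a toggle like this is typically read as shown below. This is a sketch of the usual pattern, not the code in `voice_mode/config.py`: the `env_bool` helper and the set of accepted truthy spellings are assumptions.

```python
import os


def env_bool(name: str, default: bool) -> bool:
    """Read a boolean environment variable, falling back to `default` if unset.

    Hypothetical helper; the accepted spellings below are an assumption.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Enabled unless the user explicitly opts out.
RECORDING_VISUALIZATION_ENABLED = env_bool("VOICEMODE_RECORDING_VISUALIZATION", True)

if __name__ == "__main__":
    print(RECORDING_VISUALIZATION_ENABLED)
```

With this approach, running with `VOICEMODE_RECORDING_VISUALIZATION=false` in the environment turns the display off, and leaving the variable unset keeps the default.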

Dependencies:

  • Added rich>=13.0.0 for terminal UI

Testing

Includes a manual test script (test_visualization.py) that simulates:

  1. Waiting for speech phase
  2. Active recording with speech
  3. Silence accumulation until threshold
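The three phases above can be sketched as a small generator of synthetic frames. This is only an illustration of the idea, not the contents of `test_visualization.py`; `simulate_phases`, the frame counts, the levels, and the timing values are all made up.

```python
def simulate_phases(waiting_frames: int = 3,
                    active_frames: int = 5,
                    silence_threshold_s: float = 1.0,
                    frame_s: float = 0.25):
    """Yield (state, level_percent, silence_progress) for each simulated frame.

    Hypothetical simulation: all numbers here are illustrative defaults.
    """
    # Phase 1: waiting for speech, only background noise.
    for _ in range(waiting_frames):
        yield ("WAITING", 2.0, 0.0)

    # Phase 2: active recording, level alternates to look like speech.
    for i in range(active_frames):
        yield ("ACTIVE", 60.0 + 5.0 * (i % 2), 0.0)

    # Phase 3: silence accumulates until the threshold is reached.
    silence_s = 0.0
    while silence_s < silence_threshold_s:
        silence_s += frame_s
        yield ("SILENCE", 1.0, min(1.0, silence_s / silence_threshold_s))


if __name__ == "__main__":
    for state, level, progress in simulate_phases():
        print(f"{state:<8} level={level:5.1f}%  silence={progress:4.0%}")
```

Feeding frames like these into the visualizer, one per display tick, exercises every state without needing a microphone.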

Documentation

Comprehensive guide added at docs/features/recording-visualization.md:

  • Feature overview and benefits
  • Configuration options
  • Technical details
  • Troubleshooting common issues

Changes

  • pyproject.toml - Added rich dependency
  • voice_mode/config.py - Added RECORDING_VISUALIZATION_ENABLED setting
  • voice_mode/recording_visualization.py - New visualization module
  • voice_mode/tools/converse.py - Integrated visualizer into recording flow
  • docs/features/recording-visualization.md - Feature documentation
  • test_visualization.py - Manual test script

@mbailey this is a mostly complete draft. I want to do some more extensive testing on it, but mainly wanted to get my idea down for some feedback 👍🏻 cheers!

- Add visual feedback during voice recording (audio levels, duration, speech detection)
- Integrate Rich library for live terminal UI
- Configurable via VOICEMODE_RECORDING_VISUALIZATION (enabled by default)
- Addresses product review Quick Win mbailey#4